156 research outputs found

    The Symbiotic Relationship Between Information Retrieval and Informetrics

    Informetrics and information retrieval (IR) represent fundamental areas of study within information science. Historically, researchers have not fully capitalized on the potential research synergies that exist between these two areas. Data sources used in traditional informetrics studies have their analogues in IR, with similar types of empirical regularities found in IR system content and use. Methods for data collection and analysis used in informetrics can help to inform IR system development and evaluation. Areas of application have included automatic indexing, index term weighting and understanding user query and session patterns through the quantitative analysis of user transaction logs. Similarly, developments in database technology have made the study of informetric phenomena less cumbersome, and recent innovations used in IR research, such as language models and ranking algorithms, provide new tools that may be applied to research problems of interest to informetricians. Building on the author’s previous work (Wolfram 2003), this paper reviews a sample of relevant literature published primarily since 2000 to highlight how each area of study may help to inform and benefit the other.

    Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2017)

    The large scale of scholarly publications poses a challenge for scholars in information seeking and sensemaking. Bibliometrics, information retrieval (IR), text mining and NLP techniques could help in these search and look-up activities, but are not yet widely used. This workshop is intended to stimulate IR researchers and digital library professionals to elaborate on new approaches in natural language processing, information retrieval, scientometrics, text mining and recommendation techniques that can advance the state-of-the-art in scholarly document understanding, analysis, and retrieval at scale. The BIRNDL workshop at SIGIR 2017 will incorporate an invited talk, paper sessions and the third edition of the Computational Linguistics (CL) Scientific Summarization Shared Task. Comment: 2 pages, workshop paper accepted at the SIGIR 201

    Formalised data citation practices would encourage more authors to make their data available for reuse

    It is increasingly common for researchers to make their data freely available. This is often a requirement of funding agencies but also consistent with the principles of open science, according to which all research data should be shared and made available for reuse. Once data is reused, the researchers who have provided access to it should be acknowledged for their contributions, much as authors are recognised for their publications through citation. Hyoungjoo Park and Dietmar Wolfram have studied characteristics of data sharing, reuse, and citation and found that current data citation practices do not yet benefit data sharers, with little or no consistency in their format. More formalised citation practices might encourage more authors to make their data available for reuse.

    The Influence of Effects and Phenomena on Citations: A Comparative Analysis of Four Citation Perspectives

    This article defines different perspectives for citations and introduces four concepts: Self-expected Citations, Received Citations, Expected Citations, and Deserved Citations. When comparing permutations of these four classes of perspectives, there are up to 145 kinds of equality/inequality relations. From these numerous relations, we analyze the difference between the Matthew Effect and the Matthew Phenomenon. We provide a precise definition and point out that many previous empirical research studies on the Matthew Effect based on citations belong primarily to the Matthew Phenomenon, and not the true meaning of the Matthew Effect. Due to the difficulty in determining the Deserved Citations, the Matthew Effect is in itself difficult to measure, although it is commonly believed to influence citation counts. Furthermore, from the theoretical facts, we outline four new effects/phenomena: the Self-confidence Effect/Phenomenon, the Narcissus Effect/Phenomenon, the Other-confidence Effect/Phenomenon, and the Flattery Effect/Phenomenon, and we discuss additional influencing factors.

    Measuring Author Research Relatedness: A Comparison of Word-based, Topic-based and Author Cocitation Approaches

    Relationships between authors based on characteristics of published literature have been studied for decades. Author cocitation analysis using mapping techniques has been most frequently used to study how closely two authors are thought to be in intellectual space based on how members of the research community co-cite their works. Other approaches exist to study author relatedness based more directly on the text of their published works. In this study we present static and dynamic word-based approaches using vector space modeling, as well as a topic-based approach based on Latent Dirichlet Allocation for mapping author research relatedness. Vector space modeling is used to define an author space consisting of works by a given author. Outcomes for the two word-based approaches and a topic-based approach for 50 prolific authors in library and information science are compared with more traditional author cocitation analysis using multidimensional scaling and hierarchical cluster analysis. The two word-based approaches produced similar outcomes except where two authors were frequent co-authors for the majority of their articles. The topic-based approach produced the most distinctive map.

    An Examination of Research Data Sharing and Re-Use: Implications for Data Citation Practice

    This study examines characteristics of data sharing and data re-use in Genetics and Heredity, where data citation is most common. This study applies an exploratory method because data citation is a relatively new area. The Data Citation Index (DCI) on the Web of Science was selected because DCI provides a single access point to over 500 data repositories worldwide and to over two million data studies and datasets across multiple disciplines, and monitors quality research data through a peer review process. We explore data citations for Genetics and Heredity as a case study, both formally, by examining citations recorded in the DCI, and informally, by sampling a selection of papers for implicit data citations within publications. Citer-based analysis is conducted in order to remedy self-citation in the data citation phenomena. We explore 148 sampled citing articles in order to identify factors that influence data sharing and data re-use, including references, main text, supplementary data/information, acknowledgments, funding information, author information, and web/author resources. This study is unique in that it relies on a citer-based analysis approach and analyzes peer-reviewed and published data, data repositories, and citing articles of highly productive authors where data sharing is most prevalent. This research is intended to provide a methodological and practical contribution to the study of data citation.

    Information Science: The Canadian Context

    This study outlines the development of information science in Canada through an examination of the research contributions to the field by Canadians. Identification of contributors, their backgrounds and the sub-areas of information science to which they have contributed indicates that the field continues to draw inter-disciplinary contributions from a wide variety of academic fields. Despite the tenuous self-identity of the discipline, the development of Canadian periodicals as well as undergraduate and graduate education programmes devoted to information science indicates an increased recognition of the existence and validity of the discipline. [Translated from the French abstract:] An examination of Canadian contributions to the field of information science makes it possible to trace the development of this discipline in Canada. Identifying the contributors, their training and their areas of specialization highlights the inter-disciplinary character of contributions to information science. Despite a weakly developed disciplinary identity, the emergence of Canadian journals and of teaching programmes at all university levels indicates that information science is a discipline whose existence and legitimacy are increasingly recognized.

    Open Peer Review: The Current Landscape and Emerging Models

    Open peer review (OPR) is an important innovation in the open science movement. OPR can play a significant role in advancing scientific communication by increasing its transparency. Despite the growing interest in OPR, adoption of this innovation since the turn of the century has been slow. This study provides the first comprehensive investigation of OPR adoption, its early adopters and the implementation models used. We identified 174 current OPR journals and analysed their wide-ranging implementations to derive emerging OPR models. The findings suggest that: 1) there has been a steady growth in OPR adoption since 2001 when 38 journals initially adopted OPR; 2) OPR adoption is most prevalent in medicine and the natural sciences; 3) three publishers are responsible for 87% of identified OPR journals; 4) early adopter publishers have implemented different models of OPR resulting in different levels of transparency. Across the variations in OPR implementations, two important factors define the degree of transparency: open identities and open reports. Open identities may include reviewer names and affiliation as well as credentials; open reports may include timestamped review histories consisting of referee reports and author rebuttals. When and where open reports can be accessed are also important factors indicating the OPR transparency level. Dimensions that characterize the observed OPR models are outlined.

    A Survey on Point-of-Interest Recommendations Leveraging Heterogeneous Data

    Tourism is an important application domain for recommender systems. In this domain, recommender systems are for example tasked with providing personalized recommendations for transportation, accommodation, points-of-interest (POIs), or tourism services. Among these tasks, in particular the problem of recommending POIs that are of likely interest to individual tourists has gained growing attention in recent years. Providing POI recommendations to tourists during their trip can however be especially challenging due to the variability of the users' context. With the rapid development of the Web and today's multitude of online services, vast amounts of data from various sources have become available, and these heterogeneous data sources represent a huge potential to better address the challenges of in-trip POI recommendation problems. In this work, we provide a comprehensive survey of published research on POI recommendation between 2017 and 2022 from the perspective of heterogeneous data sources. Specifically, we investigate which types of data are used in the literature and which technical approaches and evaluation methods are predominant. Among other aspects, we find that today's research works often focus on a narrow range of data sources, leaving great potential for future works that better utilize heterogeneous data sources and diverse data types for improved in-trip recommendations. Comment: 35 pages, 19 figures

    Open Peer Review in Scientific Publishing: A Web Mining Study of PeerJ Authors and Reviewers

    Purpose: To understand how authors and reviewers are accepting and embracing Open Peer Review (OPR), one of the newest innovations in the open science movement. Design: This research collected and analyzed data from the Open Access journal PeerJ over its first three years (2013-2016). Web data were scraped, cleaned, and structured using several Web tools and programs. The structured data were imported into a relational database. Data analyses were conducted using analytical tools as well as programs developed by the researchers. Findings: PeerJ, which supports optional OPR, has a broad international representation of authors and referees. Approximately 73.89% of articles provide full review histories. Of the articles with published review histories, 17.61% had identities of all reviewers and 52.57% had at least one signed reviewer. In total, 43.23% of all reviews were signed. The observed proportions of signed reviews have been relatively stable over the period since the journal’s inception. Limitations: This research is constrained by the availability of the peer review history data. Some peer reviews were not available when the authors opted out of publishing their review histories. The anonymity of reviewers made it impossible to give an accurate count of reviewers who contributed to the review process. Implications: These findings shed light on the current characteristics of OPR. Given the policy that authors are encouraged to make their articles’ review history public and referees are encouraged to sign their review reports, the three years of PeerJ review data demonstrate that there is still some reluctance by authors to make their reviews public and by reviewers to identify themselves. Originality/Value: This is the first study to closely examine PeerJ as an example of an OPR model journal. As open science moves further towards open research, OPR is a final and critical component. Research in this area must identify the best policies and paths towards a transparent and open peer review process for scientific communication.